Results 1 - 20 of 76
1.
Organ Transplantation ; (6): 435-2023.
Article in Chinese | WPRIM | ID: wpr-972935

ABSTRACT

Objective: To evaluate human organ transplantation policy in China and provide a theoretical basis for further optimizing it. Methods: Based on text mining and statistical analysis, seven normative policies on human organ transplantation issued by the national government from 2000 to 2022 were quantitatively evaluated by constructing a policy modeling consistency (PMC) index model with 10 first-level variables and 35 second-level variables. Results: Among the seven policies, six were graded as excellent and one as perfect, with an average PMC index of 8.476. Except for X8 (policy audience), the scores of the other second-level variables of P5 were higher than or equal to the mean. The scores of all second-level variables of P1 were lower than or equal to the mean. P1 and P5 differed significantly in X3 (policy timeliness), X4 (policy norms) and X6 (policy tools). P5 was more specific and relatively comprehensive in these aspects, and its score was significantly higher than that of P1. Conclusions: Human organ transplantation policies in China are generally excellent, scientific and rational. When formulating organ transplantation policies, health administrative departments at all levels should pay attention to policy timeliness and the combination of policy tools, and should fully mobilize the initiative and enthusiasm of all policy audiences to participate in organ transplantation management.
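
The abstract does not give the formula behind the PMC index. A minimal sketch, assuming the commonly used formulation in which each second-level variable is scored 0 or 1, the scores are averaged within each first-level variable, and the averages are summed (so ten first-level variables give a maximum of 10). The variable names and scores below are invented for illustration and are not data from the study.

```python
def pmc_index(scores):
    """Sum of per-variable means.

    `scores` maps each first-level variable to the list of 0/1 scores
    of its second-level variables (assumed scoring scheme).
    """
    return sum(sum(vals) / len(vals) for vals in scores.values())


# Hypothetical policy with three first-level variables (the study uses ten).
example_policy = {
    "X1": [1, 1, 0, 1],  # mean 0.75
    "X2": [1, 0],        # mean 0.50
    "X3": [1, 1, 1],     # mean 1.00
}

print(round(pmc_index(example_policy), 3))  # 2.25
```

Averaging within each first-level variable keeps a variable with many second-level items from dominating the total.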

2.
rev. udca actual. divulg. cient ; 25(1): e1947, ene.-jun. 2022. graf
Article in Spanish | LILACS-Express | LILACS | ID: biblio-1395198

ABSTRACT

Automated text analysis tools summarize large volumes of information and make it possible to generate knowledge efficiently from unstructured data such as opinions. The objective of this work was to identify priorities in communities affected by the armed conflict, using participatory exercises in 13 municipalities of Antioquia, Colombia. A total of 15,534 opinions from 9,765 people were analyzed. After text cleaning, the use, association, differentiation and importance of the terms were described according to the thematic approaches and types of opinion expressed, using text mining in R. The priorities revolved around the availability of infrastructure, equipment and supplies, since these were the problems most often mentioned by the communities and corresponded to the territorial reality. The opportunities, on the other hand, were mainly represented by natural and human resources. The text mining analysis of the participatory exercise made it possible to identify the socioeconomic priorities of the communities satisfactorily. However, preparing the information is labor intensive, and the results must be carefully reviewed to ensure their coherence. Another advantage of this tool is that the information can be analyzed by agents external to the data collection.

3.
Medicina (B.Aires) ; 82(4): 513-524, 20220509. graf
Article in Spanish | LILACS-Express | LILACS | ID: biblio-1405696

ABSTRACT

Hemolytic uremic syndrome (HUS) is characterized by thrombotic microangiopathy, hemolytic anemia, thrombocytopenia and acute renal failure. Its consequences range from permanent sequelae to death, mainly in children. In this work, using text mining (TM), we analyzed the explicit and implicit text of 16 192 original scientific articles on HUS indexed in the Europe PMC database. The objectives were to examine behaviors, track trends, make predictions and cross-check data with other sources of information. For the analysis we used, among other computational tools, workflows (WF) specially developed in the KNIME platform. TM on the words of the abstracts of the publications made it possible to: detect undescribed associations between events related to HUS; extract underlying information; perform thematic clustering using unsupervised algorithms; and make predictions about the course of research associated with the topic. Both the approach and the WFs developed to perform Data Science on HUS can be applied to other biomedical topics and other scientific databases, making it possible to analyze relevant aspects of human health in order to improve the research, prevention and treatment of multiple diseases.

4.
Japanese Journal of Social Pharmacy ; : 28-31, 2022.
Article in Japanese | WPRIM | ID: wpr-936647

ABSTRACT

We evaluated the role of pharmacists in an interdisciplinary pain center using text mining analysis. We investigated 28 patients who visited an interdisciplinary pain center from May 2014 to July 2015. All patients were interviewed by a pharmacist. Further, we performed morphological analysis of medical records; classification of appearing words into “medicines/side effects,” “diagnosis/disease name,” “pain site,” “pain characteristics/concomitant symptoms,” “life/environment,” and “mental”; and correspondence analysis. The frequently appearing words “pain characteristics/concomitant symptoms” and “medicines/side effects” were used by 47.2% doctors and 35.3% pharmacists, respectively. In the correspondence analysis, doctors frequently referred to “pain characteristics/concomitant symptoms,” pharmacists frequently referred to “medicines/side effects,” and nurses frequently referred to “life/environment” and “pain site.” The fact that the three occupations used distinguishing phrases suggests that each is specialized in a distinct area. At an interdisciplinary pain center, we interviewed a nurse, a pharmacist, and a doctor, and shared information from various angles. The pharmacist focused on listening to the “medicines/side effects,” which is information related to his profession. Pharmacists contribute to medical care by recording information in medical records and sharing the information with other occupations. It is necessary to continue to provide information related to our specialized profession, respect each other, and provide high-quality medical care.

5.
Medicina (B.Aires) ; 81(2): 214-223, June 2021. graf
Article in Spanish | LILACS | ID: biblio-1287273

ABSTRACT

In the present work we use text mining as a tool for processing a large scientific database, with the aim of obtaining new information about all the publications signed by Argentine authors and indexed until 2019 in the area of life sciences. More than 75 000 articles were analysed, published in around 5000 outlets and signed by about 186 000 authors with a workplace in Argentina or in collaboration with Argentine laboratories. Using automated tools developed ad hoc, the text of around 70 800 abstracts was analysed, seeking, through unsupervised digital detection, the main topics addressed, their relationship with health problems in Argentina and the treatment of those problems. Results are also presented on the number of publications per year, the journals that published them, and their authors and collaborations. These results, together with the predictions that were obtained, could become a useful tool to optimize the management of resources dedicated to basic and clinical research.


Subject(s)
Humans , Data Mining , Argentina
6.
Journal of Biomedical Engineering ; (6): 197-209, 2021.
Article in Chinese | WPRIM | ID: wpr-879267

ABSTRACT

To understand the evolution of the diagnosis and treatment plans for coronavirus disease 2019 (COVID-19) and to assist medical staff in actual diagnosis and treatment, this paper takes as research data the 9 COVID-19 diagnosis and treatment plans issued by the National Health Commission between January 26, 2020 and August 19, 2020, and performs comparative and visual analysis on them. Using text mining, the paper represents and measures the text similarity of the plans, both as whole documents and module by module, and summarizes how they evolved. The results provide a reference for clinical diagnosis and treatment practice and for the formulation of other diagnosis and treatment plans.
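
The abstract does not say which similarity measure was used. As an illustration only, the sketch below compares two texts with cosine similarity over raw term counts, one common choice for this kind of version-to-version comparison. The function names are hypothetical, and the tokenizer is a simple regular expression that suits space-delimited text; Chinese documents such as the plans studied here would first need word segmentation.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two term-count vectors (0.0 to 1.0)."""
    counts_a, counts_b = term_counts(text_a), term_counts(text_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one text has no tokens
    return dot / (norm_a * norm_b)
```

Applying the same function to matching modules of consecutive versions, for example the treatment sections of two successive plans, would give a per-module similarity series that can be plotted over time.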


Subject(s)
Humans , COVID-19 , Data Mining , SARS-CoV-2
7.
Chinese Journal of Hospital Administration ; (12): 417-419, 2021.
Article in Chinese | WPRIM | ID: wpr-912772

ABSTRACT

Objective: To analyze original medical humanities articles written by hospital staff and extract the key elements of hospital culture as perceived by staff, in order to make recommendations for the homogeneous development of the hospital's multiple campuses. Methods: Original medical humanities articles written by staff were collected from the WeChat accounts, websites and printed publications of a tertiary hospital in Beijing from January 2013 to December 2019, and text mining was performed with Python 3.7. BERT and the Tencent open-source dataset of 8 million Chinese words were used as the training dataset for entity identification; K-Means cluster analysis was used, and the clustering results with the better fit were selected; TextRank was used to screen the keywords of each cluster. Finally, the meaning of each category was summarized from the screened keywords. Results: Among the 341 articles collected, the high-frequency words were work, hospital, patient, sick person, development, medicine, medical treatment, doctor, outpatient service and clinic. The words fell into ten categories: process recall, work attribute, work responsibility, diagnosis and treatment behavior, growth experience, team cooperation, professionalism, management innovation, doctor-patient communication and others. Conclusions: Research on hospital staff's perception of culture helps to identify the cultural content of the hospital. In the homogeneous development of the hospital's multiple campuses, it is imperative to nurture the staff's sense of identity with and belonging to the hospital, enhance their sense of gain and happiness, and strengthen a patient-centered scientific management system.

8.
Chinese Journal of Health Management ; (6): 237-242, 2021.
Article in Chinese | WPRIM | ID: wpr-910832

ABSTRACT

Objective: To analyze public demand for information about congenital birth defects on "Baidu zhidao" based on word frequency retrieval. Methods: Following discussion between obstetrics and gynecology experts and epidemiology experts, the keywords related to congenital birth defects were determined and the search strategy was formulated. Python 2.7 was used for the web crawler search. Questions related to congenital birth defects were obtained from the "Baidu zhidao" platform, and R 4.0.2 was then used to process the data, complete the semantic analysis of keywords and the statistical analysis of word frequency, and draw word cloud graphs and polar charts to describe the key results. Results: A total of 16 668 non-repetitive questions were retrieved from the "Baidu zhidao" platform, and the frequency of semantic words was 15 371. Among them, 35.02% were names and symptoms of congenital birth defects, and congenital heart disease had the highest frequency (26.09%). Subject analysis of the birth defect keywords showed that the average word frequency of semantic words on diagnosis and treatment (49.55) was significantly higher than that of semantic words on etiology and prevention (12.47). In addition, the keywords examination, cause, treatment, development and heredity were used more frequently among the semantic words related to the seven types of systemic malformations. Conclusion: The public in China has a high demand for information on diseases related to congenital birth defects and on their causes, prevention and treatment, especially for congenital heart disease.

9.
Chinese Journal of Experimental Traditional Medical Formulae ; (24): 172-180, 2021.
Article in Chinese | WPRIM | ID: wpr-906221

ABSTRACT

Objective: To construct a knowledge base of Tibetan medicine prescriptions and explore the standardization of their names. Method: Using the concept of "man-machine combination", a terminology glossary of Tibetan medicine was constructed (data sources: national drug standards, local drug standards, classic texts on Tibetan medicines, etc.), and the terminology glossary of Tibetan medicine prescriptions was mined from it. Combined with expert review, the text association between Tibetan medicine prescriptions and various drug standards and dictionaries was constructed, and methods and techniques for standardizing prescription drug names were explored. Result: Taking the Tibetan medicine prescriptions approved for marketing in China as the research object, this paper revealed the various inconveniences caused by inconsistency between the names of prescriptions and the names of medicinal herbs. It also discussed design ideas for the name standardization of Tibetan medicines at three levels: text association, optimization of evaluation methods, and formation of an expert decision-making system. We put forward a five-in-one (screening, evaluation, reviewing, fixing, and renewing) research model of Tibetan medicine name standardization. The construction, functions and advantages of the database and thesaurus of Tibetan medicine prescriptions were described in detail, and, in combination with the text notes, the association between the standard medicinal materials and the prepared prescriptions was established. Conclusion: The text association method in this paper can accurately identify nonstandard names in Tibetan medicine prescriptions. Combined with expert review, it can be extended, to a certain extent, to the standardization of herb names in prescription sets of larger scale or with more complex network structures.

10.
Journal of Preventive Medicine ; (12): 255-258, 2021.
Article in Chinese | WPRIM | ID: wpr-876539

ABSTRACT

Objective: To evaluate the accuracy of automated classification of ICD-O-3 morphology codes from pathology reports by text mining and support vector machine (SVM), in order to provide a basis for automated tumor coding in Chinese. Methods: The tumor report cards of Zhejiang residents from 2017 to 2019 were collected from the Chronic Disease Surveillance Information Management System of Zhejiang Province. According to ICD-O-3, the keywords of the pathology reports were extracted, and SVM was used for automatic classification. The classification results were compared with those of 16 professionals with more than two years of experience in tumor coding, and the accuracy rate, recall rate and F-score were calculated to evaluate performance. Results: In total, 83 082 cases from 2017 to 2019 were included and categorized into 17 morphological classifications, with 52 877 (63.65%) cases of adenocarcinoma, squamous carcinoma and transitional cell carcinoma. A total of 1 090 keywords were included in the main corpus. The overall F-score, accuracy rate and recall rate were 85.69%, 77.20% and 96.27%, respectively. Conclusion: Text mining combined with SVM can improve the efficiency of ICD-O-3 morphology coding; however, the accuracy needs to be further improved.
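
The abstract does not define its F-score. The three reported figures are, however, consistent with the usual balanced F-score (the harmonic mean of precision and recall) if the "accuracy rate" is read as precision. That reading is an assumption. A minimal check under that assumption:

```python
def f_score(precision, recall):
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported values, with the "accuracy rate" treated as precision (assumption).
reported_precision = 0.7720
reported_recall = 0.9627

print(round(100 * f_score(reported_precision, reported_recall), 2))  # 85.69
```

Because the harmonic mean is pulled toward the smaller of its two inputs, the F-score here sits closer to the 77.20% figure than a simple average would.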

11.
Braz. j. med. biol. res ; 54(12): e11728, 2021. tab, graf
Article in English | LILACS-Express | LILACS | ID: biblio-1345573

ABSTRACT

A close interaction between basic science and applied medicine is to be expected. Therefore, it is important to measure how far apart the fields of cell biology and medicine are. Our approach to estimating the distance between these fields was to compare their vocabularies and to quantify the difference in word repertoire. We compared the vocabulary of the titles and abstracts of articles available in PubMed from two selected high-impact journals in each of three fields: cell biology, medicine, and translational science. Although each journal has its own editorial policy, we showed that within each field there is only a small vocabulary difference between the two journals. We developed a word similarity index that measures how much of a common vocabulary journals share. We found a high similarity index between the two journals within each field: cell biology (91%), medicine (71-74%), and translational science (65%). In contrast, the comparison between medicine and biology journals produced low values (22-36%), suggesting that their vocabularies are quite dissimilar. Translational medicine journals had medium similarity values when compared to cell biology journals (52-70%) and medicine journals (27-59%). The approach was also applied over 10-year periods to evaluate the evolution of each field. Using the "onomics" strategy presented here, we observed that differences between the vocabularies of basic science and medicine have been increasing over time. Since translational medicine has an intermediate vocabulary, we confirmed that translational medicine is an efficient approach to bridge this gap.
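
The abstract does not define the word similarity index. Purely as an illustration of the idea of quantifying shared vocabulary, the sketch below uses the Jaccard index (shared words divided by all distinct words) expressed as a percentage. This is a stand-in measure chosen for simplicity, not the authors' index, and the function names and example texts are hypothetical.

```python
import re


def vocabulary(texts):
    """Set of distinct lowercase alphabetic words across all texts."""
    return {w for t in texts for w in re.findall(r"[a-z]+", t.lower())}


def shared_vocabulary_pct(texts_a, texts_b):
    """Jaccard index of the two vocabularies, as a percentage."""
    vocab_a, vocab_b = vocabulary(texts_a), vocabulary(texts_b)
    union = vocab_a | vocab_b
    if not union:
        return 0.0
    return 100 * len(vocab_a & vocab_b) / len(union)


# Invented titles standing in for two journals' titles and abstracts.
journal_a = ["Cell membrane protein"]
journal_b = ["Cell protein therapy trial"]
print(shared_vocabulary_pct(journal_a, journal_b))  # 40.0
```

A set-based measure like this ignores how often each word occurs; an index that weights words by frequency would behave differently on large corpora.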

12.
Entramado ; 16(1): 252-271, ene.-jun. 2020. tab, graf
Article in Spanish | LILACS-Express | LILACS | ID: biblio-1124740

ABSTRACT

The situation of violence in Colombia has produced dynamics that are beyond the control of the State and make it difficult for state institutions to build the public agenda, since these institutions are limited in their ability to describe the social and political environment that influences the territory. To overcome these limitations, it is pertinent to generate semantic networks that describe information about the communities that are victims of the armed conflict. This provides an input that local institutions and authorities can use to recognize pillars in the formation of public policies consistent with the needs of each community. This article presents the process for the semi-automatic generation of a semantic network from the processing of textual data, using text mining tools and multivariate analysis techniques. The semantic network generated is a first approximation to describing the characteristics of the Arauca community during the years 2013-2018, which was selected as a case study.

13.
Rev. habanera cienc. méd ; 18(4): 678-692, jul.-ago. 2019. tab, graf
Article in Spanish | LILACS-Express | LILACS | ID: biblio-1093895

ABSTRACT

Introduction: Specialists of the Faculty of Psychology of the University of Havana proposed the Personal, Labor and Social Human Well-being questionnaire (BHPLS, in Spanish), which was applied to 135 Cuban workers from three social and occupational groups. Given the variety of responses, a content analysis (CA) was required for Question 1 of the questionnaire. Objective: To propose and implement software that allows semi-automatic categorization in a CA for this question. Material and Methods: The Kappa concordance index was used to evaluate agreement between experts on the category scheme. Software was implemented in the Python programming language to meet the objective, taking into account the functionalities of similar software. Results: The software "BHPLS data processing-UH®" was implemented, validated and registered. It makes it possible to set up the categories, load the data, categorize them semi-automatically and save the result, among other functionalities. Manual categorization by Psychology students obtained a negative Kappa concordance index (low agreement between experts), whereas with the proposed software a global Kappa of 0.7871 with p=0.00 was reached (high concordance and high statistical significance). In addition, an algorithm was proposed for unifying the experts' categorizations, and a Correspondence Analysis (ANACOR) was run on the combination of categorizations obtained. Conclusions: Given the high concordance achieved, the software is recommended for its adaptability, ease of use and the "humanization" of the CA. The ANACOR made it possible to observe similarities between the social and occupational groups. The software functionalities can be applied to the processing of free associations in other scenarios.
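
The abstract reports a "global Kappa" without saying which variant was computed. As a simplified illustration of how a Kappa index can turn negative, the sketch below implements Cohen's kappa for two raters, which compares observed agreement with the agreement expected by chance. The two-rater restriction and the category labels are assumptions made for the example; a multi-rater statistic may have been used in the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(labels_a)
    # Share of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each rater assigned labels independently
    # with their own observed label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical categorizations of four free-text answers.
print(cohen_kappa(["work", "work", "family", "family"],
                  ["work", "family", "family", "family"]))  # 0.5
print(cohen_kappa(["work", "family", "work", "family"],
                  ["family", "work", "family", "work"]))    # -1.0
```

A negative value, as in the second call, means the raters agree less often than chance alone would predict.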

14.
Korean Journal of Occupational Health Nursing ; : 221-229, 2019.
Article in Korean | WPRIM | ID: wpr-786325

ABSTRACT

PURPOSE: The aim of this study was to identify the core keywords and topic groups of workplace bullying research over the past 10 years in order to better understand research trends. METHODS: The study was conducted in four steps: 1) collecting abstracts, 2) extracting and cleaning semantic morphemes, 3) building a co-occurrence matrix and 4) analyzing network features and clustering topic groups. RESULTS: 437 articles published between 2010 and 2019 were retrieved from 5 databases (RISS, NDSL, Google Scholar, DBPIA and Kyobo Scholar). Forty-one abstracts from these articles were extracted, and network analysis was conducted using a semantic network module. The most important core keywords were 'turnover', 'intention', 'factor', 'program' and 'nursing'. Four topic groups were identified from the Korean databases. The major topics were 'turnover' and 'organization culture'. CONCLUSION: The review of previous research shows that turnover intention has been emphasized. Further research focused on a variety of interventions is needed to relieve workplace bullying in the nursing field.
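
Step 3 of the method, building a co-occurrence matrix, can be sketched as follows. This is a minimal illustration that assumes each abstract has already been reduced to a list of cleaned keywords; the function name and the example keyword lists are hypothetical, and the study's own tooling is not described in the abstract.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count, for each unordered keyword pair, the documents containing both."""
    counts = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts


# Invented keyword lists standing in for three cleaned abstracts.
abstracts = [
    ["turnover", "intention", "nursing"],
    ["turnover", "intention"],
    ["program", "nursing"],
]
pairs = cooccurrence(abstracts)
print(pairs[("intention", "turnover")])  # 2
```

The resulting pair counts can serve as edge weights in a keyword network, which is the input that step 4 (network analysis and topic clustering) would work on.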


Subject(s)
Bullying , Data Mining , Intention , Korea , Nursing , Semantics
15.
Journal of Rural Medicine ; : 36-41, 2019.
Article in English | WPRIM | ID: wpr-750894

ABSTRACT

Purpose: The aim of this study was to clarify the psychosocial factors supporting elderly men living alone in a heavy snowfall area where the population aging rate exceeded 40%. Methods: The authors conducted semi-structured interviews with six elderly men living alone. The contents of the interviews were analyzed by hierarchical cluster analysis via text mining. Results: The psychosocial factors supporting the elderly men living alone fell into six categories: "well-planned roof snow removal", "interaction with young people", "realization of the meaning of life via driving", "engagement in leisure and recreational activities", "living a life aligned with personal preference" and "insistence on living alone". Conclusion: Formal and informal networking that does not undermine these psychosocial factors is necessary for these men to continue living alone.

16.
Genomics & Informatics ; : e14-2019.
Article in English | WPRIM | ID: wpr-763810

ABSTRACT

The total number of scholarly publications grows day by day, making it necessary to explore and use simple yet effective ways to expose their metadata. Schema.org supports adding structured metadata to web pages via markup, making it easier for data providers but also for search engines to provide the right search results. Bioschemas is based on the standards of schema.org, providing new types, properties and guidelines for metadata, i.e., providing metadata profiles tailored to the Life Sciences domain. Here we present our proposed contribution to Bioschemas (from the project “Biotea”), which supports metadata contributions for scholarly publications via profiles and web components. Biotea comprises a semantic model to represent publications together with annotated elements recognized from the scientific text; our Biotea model has been mapped to schema.org following Bioschemas standards.
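
To make the idea of structured metadata concrete, the sketch below builds a small JSON-LD description of a publication using the schema.org type `ScholarlyArticle`. It is a minimal illustration only: the helper function and all field values are hypothetical, and it does not reproduce the Biotea model or any specific Bioschemas profile, whose required properties are not listed in the abstract.

```python
import json


def scholarly_article_jsonld(title, authors, keywords, url):
    """Build a JSON-LD dictionary describing one publication."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": title,
        "author": [{"@type": "Person", "name": a} for a in authors],
        "keywords": ", ".join(keywords),
        "url": url,
    }


# Placeholder values; none of these describe a real record.
record = scholarly_article_jsonld(
    title="An example article title",
    authors=["A. Author", "B. Author"],
    keywords=["semantics", "metadata"],
    url="https://example.org/article/1",
)
print(json.dumps(record, indent=2))
```

Markup of this kind is typically embedded in a web page inside a `script` element of type `application/ld+json`, where search engines can read it.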


Subject(s)
Biological Science Disciplines , Search Engine , Semantics
17.
Genomics & Informatics ; : e17-2019.
Article in English | WPRIM | ID: wpr-763807

ABSTRACT

Text mining has become an important research method in biology. Its original purpose was to extract biological entities, such as genes, proteins and phenotypic traits, in order to extend knowledge from scientific papers. However, few thorough studies on text mining and application development have been performed for plant molecular biology data, especially for rice, resulting in a lack of datasets for named-entity recognition tasks in this species. Since few benchmarks are available for rice, we faced various difficulties in exploiting advanced machine learning methods for accurate analysis of the rice literature. To evaluate several approaches to automatically extracting information on gene/protein entities, we built a new dataset for rice as a benchmark. This dataset is composed of titles and abstracts of scientific papers focusing on rice, downloaded from PubMed. During the 5th Biomedical Linked Annotation Hackathon, a portion of the dataset was uploaded to PubAnnotation for sharing. Our ultimate goal is to use the dataset to offer a shared task on rice gene/protein name recognition through the BioNLP Open Shared Tasks framework, to facilitate open comparison and evaluation of different approaches to the task.


Subject(s)
Benchmarking , Biology , Data Mining , Dataset , Machine Learning , Methods , Molecular Biology , Natural Language Processing , Oryza , Plants
18.
Genomics & Informatics ; : e19-2019.
Article in English | WPRIM | ID: wpr-763805

ABSTRACT

In this paper, we investigate cross-platform interoperability for natural language processing (NLP) and, in particular, annotation of textual resources, with an eye toward identifying the design elements of annotation models and processes that are particularly problematic for, or amenable to, enabling seamless communication across different platforms. The study is conducted in the context of a specific annotation methodology, namely machine-assisted interactive annotation (also known as human-in-the-loop annotation). This methodology requires the ability to freely combine resources from different document repositories, access a wide array of NLP tools that automatically annotate corpora for various linguistic phenomena, and use a sophisticated annotation editor that enables interactive manual annotation coupled with on-the-fly machine learning. We consider three independently developed platforms, each of which utilizes a different model for representing annotations over text, and each of which performs a different role in the process.


Subject(s)
Linguistics , Machine Learning , Natural Language Processing
19.
Genomics & Informatics ; : e20-2019.
Article in English | WPRIM | ID: wpr-763804

ABSTRACT

Entity normalization, or entity linking in the general domain, is an information extraction task that aims to annotate/bind multiple words/expressions in raw text with semantic references, such as concepts of an ontology. An ontology consists minimally of a formally organized vocabulary or hierarchy of terms, which captures knowledge of a domain. Presently, machine-learning methods, often coupled with distributional representations, achieve good performance. However, these require large training datasets, which are not always available, especially for tasks in specialized domains. CONTES (CONcept-TErm System) is a supervised method that addresses entity normalization with ontology concepts using small training datasets. CONTES has some limitations: it does not scale well with very large ontologies, it tends to overgeneralize predictions, and it lacks valid representations for out-of-vocabulary words. Here, we propose to assess different methods to reduce the dimensionality of the ontology representation. We also propose to calibrate parameters in order to make the predictions more accurate, and to address the problem of out-of-vocabulary words with a specific method.


Subject(s)
Dataset , Information Storage and Retrieval , Methods , Semantics , Vocabulary
20.
Medical Education ; : 160-168, 2019.
Article in Japanese | WPRIM | ID: wpr-758332

ABSTRACT

Introduction: The purpose of this research is to measure the critical thinking (CT) skills of nursing college students before and after practical training, and to examine whether situational factors such as purpose and context can affect judgments related to CT. Methods: We distributed to 795 nursing students an anonymous self-administered questionnaire comprising a scale to assess CT and free-description questions. The collected data were analyzed using statistical analysis and text mining. Results: The effective response rate was 22.01% (n=175) before training and 22.26% (n=177) after practical training. The average score on the CT scale was 163.70±17.68 before training and 171.21±19.03 after practical training. Five categories were extracted from the open-ended questions and identified as situations in which CT is used in practical training. Discussion: The average score on the CT scale rose with practical training experience. Practical training experience affected the total score on the CT scale.
